Abstract

Network dynamics may be viewed as a process of change in the edge structure 
of a network, in the vertex set on which edges are defined, or in both simul- 
taneously. Though early studies of such processes were primarily descriptive, 
recent work on this topic has increasingly turned to formal statistical mod- 
els. While showing great promise, many of these modern dynamic models 
are computationally intensive and scale very poorly in the size of the net- 
work under study and/or the number of time points considered. Likewise, 
currently employed models focus on edge dynamics, with little support for 
endogenously changing vertex sets. Here, we show how an existing approach 
based on logistic network regression can be extended to serve as a highly
scalable framework for modeling large networks with dynamic vertex sets. We
place this approach within a general dynamic exponential family (ERGM) 
context, clarifying the assumptions underlying the framework (and providing 
a clear path for extensions), and show how model assessment methods for 
cross-sectional networks can be extended to the dynamic case. Finally, we 
illustrate this approach on a classic data set involving interactions among 
windsurfers on a California beach. 

Keywords: dynamic networks, exponential family random graph models, 
logistic regression, vertex dynamics, model assessment 



*This work was supported in part by ONR award N00014-08-1-1015 and National
Science Foundation (NSF) award BCS-0827027.

* Corresponding author
Email addresses: almquist@uci.edu (Zack W. Almquist), buttsc@uci.edu (Carter
T. Butts)



Preprint submitted to Social Networks 



February 3, 2011



1. Introduction 



Change in network structure (i.e., network dynamics) has long been a 
topic of both theoretical and methodological interest within the social net- 
work community. Network dynamics may be viewed as a process of change 
in the edge structure of a network, in the vertex set on which edges are de- 
fined, or in both simultaneously. While early studies of such processes were 
primarily descriptive (e.g., Sampson, 1968; Newcomb, 1953; Coleman, 1964), 
recent work on this topic has increasingly turned to formal statistical models
(e.g., Banks and Carley, 1996; Snijders, 1996, 2001, 2005; Robins and Pattison,
2001; Krackhardt and Handcock, 2007). While showing great promise,
many of these modern dynamic models are computationally intensive and
scale very poorly in the size of the network under study, making them diffi- 
cult or impossible to apply to large networks in practical settings. Likewise, 
currently employed models focus on edge dynamics, with little support for 
endogenously changing vertex sets. Given this situation, there is a need for 
scalable approaches that - even if limited in various ways - can serve as 
a starting point for analysis of intertemporal network data at large scales. 
This paper explores the use of the well-known logistic network regression 
framework as a simple basis for the modeling of joint edge/vertex dynamics 
with various orders of temporal dependence. We expand on past work show- 
ing how this family can be derived from the theory of Exponential Family 
Random Graph Models (ERGMs) (Holland and Leinhardt, 1981a,b; Butts, 2008;
Snijders, 2002; Strauss and Ikeda, 1990) via dependence assumptions
in the dynamic case, and discuss computational issues related to its use with 
large, sparse graphs. We discuss basic parameterization issues, including one 
approach to the treatment of cases with vertex set dynamics. We follow this 
discussion with a study in which we analyze the dynamics of interpersonal
communication during 31 days of windsurfer interaction on a beach
in Southern California, the famous "beach" data set collected by Freeman
et al. (1988) (hereafter referred to as the beach network). Demonstrating several
methods for assessing model adequacy, we evaluate the ability of the
logistic family to capture the evolution of the beach network over the 31 day 
collection period. Informed by these results, we conclude by discussing some 
of the strengths and weaknesses of this approach for practical analysis of 
large-scale intertemporal data sets. 

Although existing models for joint edge/vertex evolution are rare (an example
being recent work by Krivitsky, 2009), basic statistical methods for
edge prediction have been in the social network literature for several decades
(see, e.g., Krackhardt, 1987a,b, 1988). Much of this early work involved variations
on OLS or logistic regression applied to adjacency matrices. Logistic
regression per se has a long history of being applied to social network data
(Robins et al., 1999; Wasserman and Pattison, 1996; Pattison and Wasserman,
1999; Lazega and van Duijn, 1997), due both to the fact that it arises
naturally from edgewise independence assumptions (see Holland and Leinhardt,
1981a,b) and to the wide availability of existing implementations. Less
appreciated have been the computational advantages of the logistic framework
relative to more complex schemes; methods for estimation of logistic
models on large, sparse data sets are well-developed (see, e.g., Komarek and
Moore, 2003; Komarek, 2004; Lin et al., 2008), in contrast with currently
available methods for general ERG models. We propose to take advantage 
of this latter property, formulating our models in a fashion that facilitates 
computation for even very large, sparse dynamic graphs. We also make use 
of available exponential family theory to derive a minimal set of assump- 
tions that leads immediately to a lagged logistic form for the joint evolution 
of edge structure and vertex set. This allows us to clarify what is being 
assumed in using such a model, thereby facilitating the assessment of its 
applicability in particular settings. Moreover, placing this family within the 
general family of dynamic ERGMs allows it to be readily expanded by the 
incorporation of alternative dependence assumptions (although not without 
computational cost). Key to our effort is the intuition that, in the dynamic 
case, the past history of the evolving network will account for much of the 
(marginal) dependence among edges — thus, the assumption of conditional 
independence of edges in the present (given the past) may be a much more 
effective approximation for incremental snapshots of evolving networks than 
for typical cross-sectional and/or marginalized network data. By leveraging 
this approximation, we can potentially account for many aspects of network 
evolution for systems whose size would prove prohibitive to more elaborate 
models. 

The overall structure of the paper is as follows: we begin by describing 
the basic background and notation for our proposed modeling framework, 
following this with a derivation of the dynamic logistic regression family 
with vertex dynamics from the general family of dynamic ERGMs under 
specified independence assumptions. We then consider computational issues,
including scalability and fit assessment. Finally, we illustrate the use of
this approach (and of associated adequacy diagnostics) via an application to
the evolution of interpersonal communication of windsurfers on a beach in
Southern California in the late summer of 1986.



2. Notation and Core Concepts 

We begin by laying out the basic notation and statistical framework that 
underlies both the theoretical and methodological contributions of this work. 
This section first covers the necessary graph theoretic and matrix notation 
needed for defining the ERG models. We follow this with a brief review 
of core concepts from the ERGM literature that will be exploited in the 
subsequent sections of this paper. 

2.1. Graph Notation 

We here follow the common practice of representing structural concepts in
a mixture of graph theoretic and statistical notation (see, e.g., Wasserman and
Faust, 1994; Butts, 2008). A graph in mathematical language is a relational
structure consisting of two elements: a set of vertices or nodes (here used
interchangeably), and a set of vertex pairs representing ties or edges (i.e., a
"relationship" between two vertices). Formally, this is often represented as
G = (V, E), where V is the vertex set and E is the edge set. If G is undirected,
then edges consist of unordered vertex pairs, with edges consisting of ordered 
pairs in the directed case; our development applies in both circumstances, 
unless noted otherwise. 

Here we will represent the number of elements in a given set with the
cardinality operator | · |, such that |V| and |E| are the number of vertices and
edges in G, respectively. In social network analysis, the number of vertices in a
given graph is known as either its order or its size, and is denoted
n = |V|. As noted below, we will be considering cases in which neither E nor
V is fixed, but both evolve stochastically through time. Throughout, however,
we will treat n as finite with probability 1, and assume that the elements of
V are identifiable.

A common representation of a graph, G, is that of the adjacency matrix Y,
such that $Y = (y_{ij})_{1 \le i,j \le n}$, where $y_{ij} = 1$ if i sends a tie to j, and 0 otherwise.
If G is undirected then its adjacency matrix is by definition symmetric, i.e.,
$y_{ij} = y_{ji}$; if G is directed then its adjacency matrix is not necessarily symmetric.
It is common to assume that there are no self-ties (or loops) and
thus the diagonal is represented either as all zeros ($y_{ii} = 0$) or treated as
missing ($y_{ii} = \mathrm{NA}$). This assumption is not necessary for the development
that follows.

A necessary addition to this notation is that of an index for time, t, such
that Y becomes a t-indexed vector of adjacency matrices with $Y_t$ being a
convenient shorthand for the adjacency matrix at time t, and $Y_{tij}$ an indicator
for the state of the (i, j) edge at said time. We also apply this notation to graphs,
such that $G_t = (V_t, E_t)$ denotes the state of G at time t. Our development
assumes that G is observed at a finite number of time points (i.e., we consider
network evolution in discrete time).
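
As a concrete illustration of this notation, the following minimal Python sketch stores a discrete-time network as a list of $(V_t, E_t)$ pairs and builds the adjacency matrix $Y_t$ for each snapshot. The vertex labels, the toy data, and the `adjacency` helper are hypothetical and are not taken from the beach network; they serve only to show how a changing vertex set alters the dimension of $Y_t$.

```python
# Illustrative only: a toy dynamic network G_t = (V_t, E_t) for t = 0, 1, 2.
# Vertex labels and edges are made up for this sketch.

def adjacency(vertices, edges, directed=True):
    """Return (order, Y): the sorted vertex labels and the adjacency matrix Y,
    where Y[i][j] = 1 if the i-th vertex sends a tie to the j-th, else 0."""
    order = sorted(vertices)
    index = {v: i for i, v in enumerate(order)}
    n = len(order)
    Y = [[0] * n for _ in range(n)]
    for (a, b) in edges:
        Y[index[a]][index[b]] = 1
        if not directed:
            Y[index[b]][index[a]] = 1  # undirected graphs have symmetric Y
    return order, Y

# Each snapshot is a (V_t, E_t) pair; the vertex set changes over time.
snapshots = [
    ({"a", "b", "c"}, {("a", "b"), ("b", "c")}),
    ({"a", "b", "c", "d"}, {("a", "b"), ("d", "a")}),
    ({"b", "c", "d"}, {("c", "d")}),
]

series = [adjacency(V, E) for (V, E) in snapshots]
for t, (order, Y) in enumerate(series):
    print(t, order, Y)
```

Because the elements of V are assumed identifiable, the labels (rather than matrix positions) are what link a vertex across time points in this representation.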

2.2. Random Graph Models and Exponential Family Form 

When modeling networks, it is helpful to represent their distributions 
via random graphs in exponential family form. The explicit use of statistical
exponential families to represent random graph models was introduced
by Holland and Leinhardt (1981a), with important extensions by Frank
and Strauss (1986) and subsequent elaboration by Wasserman and Pattison
(1996) and others. Often misunderstood as a type of model per se, the
ERG (exponential-family random graph) formalism is in fact a framework 
for representing distributions on graph sets, and is complete for distributions 
with countable support (i.e., one can always write such a distribution in ERG 
form, albeit not always parsimoniously). The power of this framework lies in 
the extensive body of inferential, computational, and stochastic process theory
(borrowed from the general theory of discrete exponential families) that
it makes available (see, e.g., Barndorff-Nielsen, 1978; Brown, 1986), providing a common
"language" for expressing and working with random graph models.

Given a random graph $G$ on support $\mathcal{G}$, we may write its distribution in
exponential family form as follows:

$$\Pr(G = g \mid s, \theta) = \frac{\exp\left(\theta^{T} s(g)\right)}{\sum_{g' \in \mathcal{G}} \exp\left(\theta^{T} s(g')\right)} \, \mathbb{I}_{\mathcal{G}}(g) \tag{1}$$

where $\Pr(\cdot)$ is the standard probability measure, $\mathcal{G}$ is the support of $G$, $g$ is
the realized graph, $s$ is the function of sufficient statistics, $\theta$ is a vector of
parameters, and $\mathbb{I}_{\mathcal{G}}$ is the indicator function (i.e., 1 if its argument is in the
support $\mathcal{G}$, 0 otherwise).
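
To make the exponential family form concrete, the following Python sketch evaluates it by brute force for all undirected graphs on a handful of labeled vertices. The choice of sufficient statistics (edge and triangle counts), the parameter values, and the function names are assumptions made for illustration only. Enumeration of this kind is feasible only for very small n, since the support contains 2^(n(n-1)/2) graphs.

```python
import itertools
import math

def ergm_pmf(n, theta, stats):
    """Brute-force pmf of an ERGM on all undirected graphs with n labeled vertices.

    theta: list of parameters; stats: function mapping an edge set to a list
    of sufficient statistics of the same length."""
    dyads = list(itertools.combinations(range(n), 2))
    graphs = []
    for bits in itertools.product([0, 1], repeat=len(dyads)):
        graphs.append(frozenset(d for d, b in zip(dyads, bits) if b))
    weights = [math.exp(sum(t * s for t, s in zip(theta, stats(g)))) for g in graphs]
    z = sum(weights)  # normalizing factor: a sum over the entire support
    return {g: w / z for g, w in zip(graphs, weights)}

def edge_and_triangle_counts(edges):
    """Hypothetical choice of s(g): number of edges and number of triangles."""
    nodes = sorted({v for e in edges for v in e})
    triangles = sum(
        1
        for a, b, c in itertools.combinations(nodes, 3)
        if {(a, b), (a, c), (b, c)} <= edges
    )
    return [len(edges), triangles]

pmf = ergm_pmf(4, theta=[-1.0, 0.5], stats=edge_and_triangle_counts)
print(len(pmf))           # number of graphs in the support: 2 ** 6
print(sum(pmf.values()))  # probabilities sum to 1 (up to rounding)
```

Setting the triangle parameter to zero in this sketch leaves only the edge count, in which case the distribution factorizes over dyads and no enumeration is needed.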

While the extreme generality of this framework has made it attractive,
model selection and parameter estimation are often difficult due to the normalizing
factor in the denominator of Equation 1 (which is effectively incomputable
except in special cases such as the Bernoulli and dyad-multinomial
random graph families (Holland and Leinhardt, 1981a)). The first applications
of this family (stemming from Holland and Leinhardt's seminal 1981
paper) focused on these special cases. Frank and Strauss (1986) introduced
a more general estimation procedure based on cumulant methods, but this
proved too unstable for general use; emphasis then switched to approximate
inference using maximum pseudo-likelihood estimation (Besag, 1974),
as popularized in this application by Strauss and Ikeda (1990) and later
Wasserman and Pattison (1996). Although maximum pseudo-likelihood estimation
(MPLE) coincides with maximum likelihood estimation (MLE) in
the limiting case of edgewise independence, the former was found to be a poor
approximation to the latter in many practical settings, thus leading to a consensus
against its general use (see, e.g., Besag (2001) and van Duijn et al.
(2007)). The development of effective Markov chain Monte Carlo strategies
for simulating draws from ERG models in the late 1990s (Anderson et al.,
1999b; Snijders, 2002) led to the current focus on MLE methods based either
on first order method of moments (which coincides with MLE for this family)
or on importance sampling (Geyer and Thompson, 1992). Algorithms
for parameter estimation and model selection using these approaches are implemented
in a number of software packages (see, e.g., Snijders et al., 2007;
Handcock et al., 2003; Wang et al., 2009), and empirical applications are increasingly
common (e.g., Goodreau et al., 2009; Snijders and Doreian, 2010;
Robins and Pattison, 2001, etc.).

This tension between the capacity of the ERGM framework to represent
computationally difficult models with substantial dependence and the need
for models that can be deployed in practical settings has been a defining
theme of research in this area. In this paper, our concern is primarily with
the latter problem: we seek families of models for network dynamics that
are computationally tractable and easily interpreted. At the same time,
however, we recognize the power and flexibility of the ERGM representation,
particularly as a tool for embedding simple models within a much broader
family (thus paving the way for subsequent expansion). As such, we will
draw heavily on the exponential family framework in our development, even
when working with cases that can be represented in other ways (e.g., logistic
regression).

3. Modeling Network Dynamics with Logistic Regression



Consider a discrete time series $\ldots, Y_0, Y_1, \ldots$, where $Y_i \in \{0, 1\}$. One
approach to modeling such a series is to posit that each $Y_i$ arises as a
Bernoulli trial whose parameter, $\phi_i$, is the inverse logit of some given function
of $Y_{i-1}, Y_{i-2}, \ldots$ (along, perhaps, with some vector of covariates $X_i$).
This model family is equivalent to logistic regression of Y involving one or
more "lagged" terms (i.e., functions of the prior values of Y), and is thus
referred to as lagged logistic regression (a natural analog of the Gaussian AR
process (Brockwell and Davis, 2002; Shumway and Stoffer, 2006)). Models
with lagged logistic form have been used for studying network dynamics, but 
the family as a whole has a higher level of generality than has been exploited 
in the social network literature. In the development that follows, we review 
and extend the derivation of an analogous family of processes for dynami- 
cally evolving network data. In keeping with the analogy, we refer here to 
the models associated with these processes as dynamic network logistic re- 
gression or lagged network logistic regression models. Although this family 
lacks the full flexibility of the general ERGMs cited above, it has the advan- 
tage of being simple, scalable, and easily extensible to the case of network 
vital dynamics (the "birth" and "death" of vertices). These features make 
this model family a natural starting point for dynamic network modeling on 
large graph sequences. Even where the family proves inadequate unto itself, 
its extensibility provides a natural path for incorporation of more complex 
forms of dependence. 
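
The scalar version of this idea can be sketched in a few lines of Python. The example below simulates a binary series in which the success probability is the inverse logit of an intercept plus a coefficient on the previous value, and then recovers the two coefficients by Newton-Raphson. The parameter values, seed, and function names are assumptions chosen for illustration; this is a minimal sketch of first-order lagged logistic regression, not the estimation routine used later in the paper.

```python
import math
import random

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate(n, b0, b1, seed=0):
    """Simulate Y_i ~ Bernoulli(inv_logit(b0 + b1 * Y_{i-1})), starting from 0."""
    rng = random.Random(seed)
    y = [0]
    for _ in range(n - 1):
        p = inv_logit(b0 + b1 * y[-1])
        y.append(1 if rng.random() < p else 0)
    return y

def fit_lag1(y, iters=25):
    """Newton-Raphson MLE for (b0, b1) in a first-order lagged logistic model."""
    b0, b1 = 0.0, 0.0
    pairs = list(zip(y[:-1], y[1:]))  # (lagged value, current value)
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for lag, cur in pairs:
            p = inv_logit(b0 + b1 * lag)
            w = p * (1.0 - p)
            g0 += cur - p            # score for the intercept
            g1 += (cur - p) * lag    # score for the lag coefficient
            h00 += w
            h01 += w * lag
            h11 += w * lag * lag
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * step = g, where H is the information matrix.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

y = simulate(5000, b0=-1.0, b1=2.0)
print(fit_lag1(y))  # estimates should lie near (-1.0, 2.0)
```

In the network setting developed below, the same structure applies with edge (and vertex) indicators in place of the scalar series, and with functions of past graphs in place of the single lagged value.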

As noted, an important consideration in our development is scalability 
to graphs with large vertex sets. Recent innovations in data collection, as 
well as new forms of social interaction (e.g., online social networks) have 
greatly expanded the size of social networks available for study. While this 
has been a boon to analysts, it has also posed significant challenges: the 
computational complexity of many basic network properties grows rapidly 
with the size of the vertex set, and the Monte Carlo procedures underlying 
conventional statistical procedures for network modeling require that such 
properties be evaluated large numbers (e.g., millions) of times. These com- 
plexity problems are exacerbated in the dynamic case by the need to perform 
such computations for multiple temporal cross-sections. It is worth noting 
that computational power and algorithmic efficiency both continue to improve
with time; at present, however, implementations
of general frameworks such as the actor-oriented models of Snijders (2001)
or the dynamic ERG models of Krackhardt and Handcock (2007) and Krivitsky
(2009) are often impractical to apply to networks having even a few thousand
nodes. Although scalability is a challenge for virtually all non-trivial
network models, simplifying assumptions can often allow efficiency gains that 
permit the analysis of data that would otherwise be out of reach of statistical 
procedures. We now turn to a consideration of one such set of assumptions, 
which jointly imply a general conditional logistic structure for networks with 
jointly evolving edge and vertex sets. 

3.1. The Core Dependence Structure 

In the conventional, cross-sectional case where V is fixed, logistic models 
arise from the assumption that all edges are independent conditional on a 
fully-observed set of covariates (Wasserman and Robins, 2005). Although
potentially adequate in networks with very strong covariate effects (Butts,
2003), such models are often poor approximations where covariate information
is limited, or where complex interactive processes are the primary
drivers of tie formation and dissolution (Goodreau et al., 2009). Consider,
however, the case of network "panel" data, in which an evolving network is
measured at regular intervals during its evolution. Here, too, simultaneity
can be a problem, and specialized modeling schemes like those of Snijders
(2001), Krackhardt and Handcock (2007) and Krivitsky (2009) have been
proposed to capture this dependence. If the intervals over which we measure 
the network are suitably fine, however, very little simultaneous dependence is 
likely to occur: for many systems, much of what transpires over a short time 
interval can be treated as independent given the past history of interaction, 
and suitable covariates. (Indeed, taking this logic to its infinitesimal extreme
results in the relational event framework of Butts (2008), which exploits this
property to measure the dynamics of event-based interaction in continuous
time.) Where this assumption is reasonable, it may be possible to approximate
the process of network evolution by an inhomogeneous Bernoulli graph
process in which edge states at future times depend upon the past history of
the network, but not (conditionally) upon other edges at the same time point. Such
an approximation would allow one to leverage the substantial computational
and interpretive advantages of the Generalized Linear Model (GLM) framework,
while still capturing the critical mechanisms of network evolution. 

The model family we propose is one that leverages potentially complex de- 
pendence on the past together with conditional independence in the present to 
flexibly capture network evolution in a way that nonetheless reduces to lagged 
logistic regression. Specifically, we derive our model family from the core assumption
that $E_{t+1}$ depends only on $V_{t+1}$ and $(E_t, V_t), \ldots, (E_{t-k}, V_{t-k})$, and
$V_{t+1}$ depends only on $(E_t, V_t), \ldots, (E_{t-k}, V_{t-k})$, together with any exogenous
covariates (see Figure 1). Intuitively, this can be thought of as specifying that
today's vertices are determined by the past network structure (out to some 
limit, k), and that today's edges are determined by both this past structure 
and today's vertices. One of the effects of this framework is that it allows 
uncertainty in network composition to be considered when making predic- 
tions. As we shall see, explicitly considering this aspect of network structure 
(which has been largely overlooked in prior research) leads to a very different 
view of network dynamics in contexts for which vertex entry and exit are 
possible. 

[ Figure 1 ]

Although the aforementioned model family treats edges as conditionally 
independent within time steps, they may depend upon past time steps via 
arbitrary functions of previous graph realizations (up to some finite order, 
k). We call such functions of previous network states lag terms (in analogy 
with time series models), with the order of a lag term corresponding to the 
temporal difference between the earliest cross-section employed by the term 
and the current cross-section. (Thus, a first order term involves only the
previous time step, a second order term reaches back at most two time steps, etc.) In general,
our framework allows for arbitrary choice of k (and thus dependence over 
arbitrarily long lags). 
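
As an illustration of what a lag term might look like in practice, the Python sketch below computes two simple statistics for an ordered vertex pair from a short history of snapshots: a first order term (whether the tie was present at the previous time step) and a term of order at most k (the number of the last k snapshots in which the tie was present). Both statistics, the toy history, and the `lag_terms` helper are hypothetical choices made for this example.

```python
def lag_terms(history, i, j, k):
    """Return simple lag statistics for the ordered pair (i, j).

    history: list of (vertices, edges) snapshots, oldest first, ending at t-1.
    Returns [tie present at t-1, number of the last k snapshots containing (i, j)],
    i.e., a first order term and a term of order at most k."""
    window = history[-k:]
    _, prev_edges = history[-1]
    first_order = 1 if (i, j) in prev_edges else 0
    persistence = sum(1 for (_, edges) in window if (i, j) in edges)
    return [first_order, persistence]

history = [
    ({"a", "b"}, {("a", "b")}),
    ({"a", "b", "c"}, {("a", "b"), ("b", "c")}),
    ({"a", "b", "c"}, {("b", "c")}),
]
print(lag_terms(history, "a", "b", k=3))  # [0, 2]
print(lag_terms(history, "b", "c", k=3))  # [1, 2]
```

Terms of this kind enter the model as ordinary covariates, which is what allows dependence on the past to be arbitrarily rich while edges within a time step remain conditionally independent.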

3.2. Deriving the Likelihood 

To obtain the dynamic logistic network regression representation for our 
process, we break down the derivation into two distinct parts. First we define 
the necessary assumptions for the likelihood of the relational structure of the 
graph given the vertex set, and next we define the necessary assumptions 
to derive the fully logistic structure for modeling both the vertices and the
edges within a lagged logistic regression model. Note that unlike the preceding
sections, where we employed the edge set notation (E), we now apply the
adjacency matrix notation (Y) for greater flexibility
in handling edge set decomposition.

We start by relaxing the temporal Markov and fixed vertex set assumptions
of Hanneke and Xing (2007), replacing them with weaker versions. We
then impose some conditional edge and vertex independence assumptions,
and lastly we make some homogeneity assumptions. We formally specify
these assumptions in Sections 3.2.1 and 3.2.2, combining them to derive the
likelihood of the dynamic network logistic regression model family.
This structure allows us two distinctive advantages over Hanneke and Xing (2007)
and others. The first advantage is that, unlike Hanneke and Xing
(2007), we do not require the vertex set to be fixed, and thus the number
and identity of vertices may change with time (an important factor when
modeling emergent networks, e.g., as arising following disasters, in naturally
occurring groups, etc.). The second important distinction is that we explicitly
develop the dependence conditions needed for inhomogeneous Bernoulli
structure, in comparison to Hanneke and Xing (2007), whose computational
examples implicitly assume Bernoulli structure but who do not elaborate the
associated theoretical assumptions. This development facilitates the expan- 
sion of the present model family by relaxation of conditional independence, 
where necessary. 

3.2.1. Part 1: Edges Given the Vertex Set 

We consider first the evolution of edges, given the vertices present in the
network. Given a graph $G_i \sim (Y_i, V_i) = Z_i$ and covariate set $X_t$ (noting
that X may contain covariate information from prior time points) with $i \in
1, \ldots, t$, we formally specify our assumptions below: (i) states that the state
of the network at any given time point depends only on the states of the
networks over some previous k time points (the relaxed temporal Markov
assumption); (ii) asserts conditional independence of edges in the same time
slice, given past history and covariates; and (iii) and (iv) assert that the
stochastic process generating the network is temporally homogeneous (given
the covariates).

(i) For some specified $k > 0$, $Z_i \mid \{Z_{i-1}, \ldots, Z_{i-k}, X_t\}$ is independent of
$Z_{i-k-\delta}$ for all $\delta > 0$.

(ii) $Y_{ijk}$ is independent of $Y_{igh}$ given $\{V_i, Z_{i-1}, \ldots, Z_{i-k}, X_t\}$ for all $(j, k) \neq (g, h)$.

(iii) Let $f_Y$ be the conditional pmf of $Y_i$ (i.e., an arbitrary time slice of Y).
For any realizable $y, y_1, y_2, \ldots, y_k, v, v_1, v_2, \ldots, v_k$ (with $z_l = (y_l, v_l)$) then, for all $i, j \in
1, \ldots, t$:

$$f_Y(Y_i = y \mid V_i = v, Z_{i-1} = z_1, \ldots, Z_{i-k} = z_k, X_t = x_t) = f_Y(Y_j = y \mid V_j = v, Z_{j-1} = z_1, \ldots, Z_{j-k} = z_k, X_t = x_t)$$

(iv) Let $f_V$ be the conditional pmf of $V_i$ (i.e., an arbitrary time slice of
V). For any realizable $y_1, y_2, \ldots, y_k, v, v_1, v_2, \ldots, v_k$ (with $z_l = (y_l, v_l)$) then, for all $i, j \in
1, \ldots, t$:

$$f_V(V_i = v \mid Z_{i-1} = z_1, \ldots, Z_{i-k} = z_k, X_t = x_t) = f_V(V_j = v \mid Z_{j-1} = z_1, \ldots, Z_{j-k} = z_k, X_t = x_t).$$

From these assumptions, we can derive the joint likelihood of the network
time series. We begin by applying assumption (i), which allows us to decompose
the joint likelihood of the time series as a product of conditional
distributions:

$$\Pr((Y, V) = (y, v) \mid X_t) = \prod_{i=k}^{t} \Pr(Z_i = z_i \mid Z_{i-1}, \ldots, Z_{i-k}, X_t)$$

Applying assumption (ii) we can further decompose the joint likelihood into
vertex and adjacency components, the latter written as products over individual
edge variables:

$$= \prod_{i=k}^{t} \Pr(Y_i = y_i \mid V_i, Z_{i-1}, \ldots, Z_{i-k}, X_t) \times \Pr(V_i = v_i \mid Z_{i-1}, \ldots, Z_{i-k}, X_t)$$

$$= \prod_{i=k}^{t} \Pr(V_i = v_i \mid Z_{i-1}, \ldots, Z_{i-k}, X_t) \times \prod_{i=k}^{t} \prod_{(g,h) \in V_i^2} \Pr(Y_{igh} = y_{igh} \mid V_i, Z_{i-1}, \ldots, Z_{i-k}, X_t) \tag{2}$$

Homogeneity assumptions (iii) and (iv) allow the above to be written in terms
of the pmfs $f_V$ and $f_Y$:

$$= \prod_{i=k}^{t} f_V(v_i \mid Z_{i-1}, \ldots, Z_{i-k}, X_t) \times \prod_{i=k}^{t} \prod_{(g,h) \in V_i^2} f_Y(y_{igh} \mid V_i, Z_{i-1}, \ldots, Z_{i-k}, X_t)$$

which by the completeness of the exponential family representation for a
binary variable leads us to

$$= \prod_{i=k}^{t} f_V(v_i \mid Z_{i-1}, \ldots, Z_{i-k}, X_t) \times \prod_{i=k}^{t} \prod_{(g,h) \in V_i^2} \operatorname{logit}^{-1}\left(u(y_{igh}, V_i, Z_{i-1}, \ldots, Z_{i-k}, X_t)\right). \tag{3}$$

Thus, each adjacency snapshot is conditionally a logistic network model, and
is separable from the likelihood of V.

3.2.2. Part 2: Vital Dynamics

There exist few inferential models in the social network literature which 
model the vital dynamics of a social network; however, vital dynamics can 
greatly influence the nature and characteristics of a given social network. We 
propose using the aforementioned dynamic logistic regression as a reasonable 
starting point. As with edge dynamics, logistic structure for vertex entry 
("birth") and exit ("death") arises naturally given a series of simplifying 
conditional independence assumptions.^

In order to model vital dynamics in a practical fashion, we propose the
following additional simplifying assumptions. We begin with (v), which simply
states that there exists a finite set that contains all vertices at risk of
entering the network over the entire time period $1, \ldots, t$. Next, we make another
conditional independence assumption (vi), such that the vertex set at time
t, $V_t$, is conditionally independent of network realizations prior to a fixed point
in the past. We then assume (vii) that the indicator of vertex g is conditionally
independent of the indicator of vertex h, $h \neq g$ (i.e., whether vertex g is
present or not is conditionally independent of h), given the edge set at time
t, the past realizations of the edge and vertex set, and exogenous covariates.
Lastly, we make a homogeneity assumption (viii) that parallels that of the
edge case.



(v) There exists some finite set $V_{\max}$ such that $V_i \subseteq V_{\max}$ for all $i \in 1, \ldots, t$.

(vi) $V_i$ is independent of $Z_{i-k-\delta}$ given $Z_{i-1}, \ldots, Z_{i-k}, X_t$ for all $\delta > 0$.

(vii) $I(g \in V_i)$ is independent of $I(h \in V_i)$ given $Z_{i-1}, \ldots, Z_{i-k}, X_t$, for all
$g \neq h$.

(viii) Let $f_{V,i}$ be the conditional pmf of inclusion for some vertex i in some
$V_j$. Then, given any realizable $z_{i-1}, \ldots, z_{i-k}$, for all $i \in 1, \ldots, t$ and
all $g, h \in V_{\max}$,

$$f_{V,i}\left(I(g \in V_i) = 1 \mid Z_{i-1} = z_{i-1}, \ldots, Z_{i-k} = z_{i-k}, X_t = x_t\right) = f_{V,i}\left(I(h \in V_i) = 1 \mid Z_{i-1} = z_{i-1}, \ldots, Z_{i-k} = z_{i-k}, X_t = x_t\right) \tag{4}$$



^Note that we do not require that vertices can enter or exit only once, although adding
such an assumption may be appropriate in some settings.

With assumptions (v), (vi), (vii), and (viii) and the exponential family argument
applied earlier, we may rewrite the left hand side of equation 3 as

$$\begin{aligned}
\prod_{i=k}^{t} f_V(V_i \mid Z_{i-1}, \ldots, Z_{i-k}, X_t) &= \prod_{i=k}^{t} \prod_{g \in V_{\max}} f_V\left(I(g \in V_i) \mid Z_{i-1}, \ldots, Z_{i-k}, X_t\right) \\
&= \prod_{i=k}^{t} \prod_{g \in V_{\max}} \operatorname{logit}^{-1}\left(w(I(g \in V_i), Z_{i-1}, \ldots, Z_{i-k}, X_t)\right)
\end{aligned} \tag{5}$$



Thus, with these additional constraints we acquire a dual-logistic structure.
We may thus summarize the likelihood of the vertex portion of the model
and the edge portion of the model in separable terms. The vertex likelihood
is given by

$$\Pr(V_t \mid Z_{t-1}, \ldots, Z_{t-k}) = \prod_{i=1}^{|V_{\max}|} \operatorname{logit}^{-1}\left(w(I(v_i \in V_t), Z_{t-1}, \ldots, Z_{t-k})\right) \tag{6}$$

and the edge likelihood by

$$\Pr(Y_t \mid V_t, Z_{t-1}, \ldots, Z_{t-k}) = \prod_{(i,j) \in V_t \times V_t} \operatorname{logit}^{-1}\left(u(Y_{tij}, V_t, Z_{t-1}, \ldots, Z_{t-k})\right), \tag{7}$$

with the joint likelihood being the product of the two. A useful computational
side effect of this is that we may use a single logistic routine to fit the entire
model, using the augmented vector of the adjacency matrix and the temporal
vertex indicator set (Equations 6 and 7).
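
One way to picture the augmented response vector is as a single stacked data set in which vertex-indicator rows and edge rows occupy disjoint blocks of the design matrix. The Python sketch below builds such a data set under a first-order model; the four columns, the toy snapshots, the risk set `v_max`, and the `stacked_rows` helper are all hypothetical choices for illustration.

```python
def stacked_rows(snapshots, v_max):
    """Stack vertex and edge observations into one logistic regression data set.

    Each row is (response, features). Features have four hypothetical columns:
    [edge intercept, lagged tie, vertex intercept, lagged presence]. Edge rows
    carry zeros in the vertex columns and vice versa, so the two submodels
    share a fitting routine but no parameters."""
    rows = []
    for t in range(1, len(snapshots)):
        vertices, edges = snapshots[t]
        prev_vertices, prev_edges = snapshots[t - 1]
        # Vertex rows: one per member of the risk set.
        for g in sorted(v_max):
            rows.append((int(g in vertices), [0, 0, 1, int(g in prev_vertices)]))
        # Edge rows: one per ordered pair of vertices present at time t.
        for g in sorted(vertices):
            for h in sorted(vertices):
                if g != h:
                    rows.append(
                        (int((g, h) in edges), [1, int((g, h) in prev_edges), 0, 0])
                    )
    return rows

snapshots = [
    ({"a", "b", "c"}, {("a", "b"), ("b", "c")}),
    ({"a", "b", "c", "d"}, {("a", "b"), ("d", "a")}),
    ({"b", "c", "d"}, {("c", "d")}),
]
v_max = {"a", "b", "c", "d"}
rows = stacked_rows(snapshots, v_max)
print(len(rows))
for response, features in rows[:6]:
    print(response, features)
```

The resulting rows can be handed to any standard logistic regression routine; because the two blocks of columns never overlap, the fitted edge and vertex coefficients are the same as those obtained by fitting the two submodels separately.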

The above provides a fairly flexible and highly tractable framework for
modeling joint edge/vertex dynamics, for the case in which the risk set of
potentially appearing vertices is known (or can be approximated as such). In
some cases, this risk set may be well-approximated by the set of all vertices
ever appearing in the network (i.e., when the chance of a vertex being effectively
at risk and never actually appearing is small). In other cases, it may be
desirable to consider a larger population of potential actors. (We assume at
present that this set is bounded, although extensions using Dirichlet processes
(Ferguson, 1973) or the like could be employed to generalize this framework
to the unbounded case.) For inferential purposes, estimation of the parameters
of both vital dynamics and edge dynamics is performed within the same logistic
regression, and the two sets of parameters are fully separable. In the case of simulation, however,
the dependence structure illustrated in Figure 1 requires alternately sampling
vertices and (conditionally) edges on those vertices. As this suggests, both
edge and vertex submodels can interact in complex ways to create network
structure, even where these models are inferentially distinct. An example of
this interaction is shown in Section 5.
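
The alternating sampling scheme described above can be sketched as follows. At each step the vertex set is drawn first from the risk set, given the previous snapshot, and edges are then drawn among the vertices that are present. The first-order parameterizations, the coefficient values, and the function names are placeholders assumed for this example, not estimates from any data set.

```python
import math
import random

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate_step(prev, v_max, theta_v, theta_e, rng):
    """Draw (V_t, E_t) given the previous snapshot under a first-order model.

    theta_v = (intercept, lagged presence); theta_e = (intercept, lagged tie).
    Both parameterizations are hypothetical placeholders."""
    prev_vertices, prev_edges = prev
    # Step 1: sample the vertex set from the risk set, given the past.
    vertices = set()
    for g in sorted(v_max):
        p = inv_logit(theta_v[0] + theta_v[1] * (1 if g in prev_vertices else 0))
        if rng.random() < p:
            vertices.add(g)
    # Step 2: sample edges among the vertices that are present, given the past.
    edges = set()
    for g in sorted(vertices):
        for h in sorted(vertices):
            if g != h:
                lagged = 1 if (g, h) in prev_edges else 0
                if rng.random() < inv_logit(theta_e[0] + theta_e[1] * lagged):
                    edges.add((g, h))
    return vertices, edges

rng = random.Random(1)
v_max = set(range(10))
state = (set(), set())  # start from an empty network
series = []
for _ in range(5):
    state = simulate_step(state, v_max, (-1.0, 2.0), (-2.0, 3.0), rng)
    series.append(state)
    print(len(state[0]), len(state[1]))
```

Even in this simple sketch, the number of edges at each step depends on how many vertices happen to be present, which illustrates how the two submodels interact in simulation despite being separable for inference.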



4. Practical Considerations: Scalability, Estimation, and Model 
Adequacy Assessment 

When implementing and evaluating models of the type discussed here,
there are several important practical considerations. First
is the issue of scalable implementation. One advantage of the logistic framework
is that there is a large body of work in computer science and machine
learning regarding inference for logistic regression in large, sparse matrix
settings, which can be utilized when fitting logistic models to large dynamic
networks. Second, the estimation of parameters and their variances is an
important concern, particularly to social scientists who employ coefficients to
estimate the strength of putative tie formation mechanisms (as opposed, e.g.,
to "black box" forecasting). Third, some method of model evaluation is necessary
so as to assure the analyst that the model captures the important
macro-level characteristics of the graph which inform his or her theory.

The following three sections represent an integrated discussion of these
issues. Following this section, an application of these methods is demonstrated
on a dynamic interpersonal communication network.

4.1. Scalability

As noted earlier, logistic regression is a popular and well-established
technique for statistical analysis (McCullagh and Nelder, 1999). Standard
optimization techniques may be applied to logistic regression for quite
large data sets with current technology (e.g., in Section 5 we apply a
gradient-based optimization technique on the full likelihood to an evolving
network of 95 actors, and have had success with networks larger by an order
of magnitude or more).

The scalability of logistic regression has been of particular interest in
the machine learning literature, as the approach is used on a wide range of
problems such as neural networks and binary classification (Devroye et al.,
1996). The computer science community has thus spent significant amounts of
time and energy in developing scalable logistic regression algorithms
(Komarek, 2004; Komarek and Moore, 1999; Lin et al., 2008). The current
literature centers on four core methods: iterative scaling (Darroch and
Ratcliff, 1972; Della Pietra et al., 1997; Goodman, 2002; Jin et al., 2003),
nonlinear conjugate gradient (Vetterling and Flannery, 1992), limited memory
quasi-Newton (also known as L-BFGS methods; Liu and Nocedal, 1989; Benson
and Moré, 2001), and truncated Newton (Komarek and Moore, 2005). Malouf
(2002) found that the limited memory quasi-Newton methods were the most
efficient in a series of computational trials. Recently, Lin et al. (2008)
have proposed and implemented (Fan et al., 2008) a trust-region Newton
method for large-scale logistic regression based on the optimization
technique of Lin and Moré (1999). Lin et al. (2008) successfully apply their
method to data sets with hundreds of millions of observations and millions
of covariates. Each of these methods typically involves clever ways of
managing the linear algebra and derivative problems encountered in modern
optimization problems.

All of these methods are potentially applicable to the problem of dynamic 
network logistic regression. The richness of this literature and the constant 
growth in optimization of large-scale data problems allow the methods dis- 
cussed in this paper to be applied to increasingly large data sets (large in 
time, vertex size or both). While not all network time series require such 
methods, the latter's availability makes this approach particularly useful for 
cases in which more general models would prove computationally infeasible. 
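
As a small illustration of why sparse logistic regression methods carry
over to dynamic network data, the following sketch fits a logistic model by
plain gradient ascent on rows stored as sparse dictionaries. It is not one
of the algorithms cited above, all of which are considerably more
sophisticated; the function names, the toy data-generating process, and the
step-size settings are assumptions made purely for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_sparse_logistic(rows, y, n_features, steps=400, lr=2.0):
    """Gradient ascent on the mean Bernoulli log-likelihood.

    Each row is a dict {feature_index: value} holding only nonzero
    entries, so the cost per pass scales with the number of nonzeros.
    """
    theta = [0.0] * n_features
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * n_features
        for row, yi in zip(rows, y):
            eta = sum(theta[j] * v for j, v in row.items())
            resid = yi - sigmoid(eta)
            for j, v in row.items():
                grad[j] += resid * v
        theta = [t + lr * g / n for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy data: feature 0 is an intercept, feature 1 a lagged-tie
# indicator that is zero (and hence not stored) for most dyads.
rng = random.Random(0)
true_theta = [-2.0, 1.5]
rows, y = [], []
for _ in range(1000):
    row = {0: 1.0}
    if rng.random() < 0.3:
        row[1] = 1.0
    eta = sum(true_theta[j] * v for j, v in row.items())
    rows.append(row)
    y.append(1 if rng.random() < sigmoid(eta) else 0)

theta_hat = fit_sparse_logistic(rows, y, n_features=2)
```

In a real application the rows would be the stacked vertex and dyad
observations over all time points, and one of the cited large-scale solvers
would replace the simple update loop.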

4.2. Estimation: Bayesian Analysis and Bias Reduction

In conducting likelihood-based inference via Equations 6 and 7, both
frequentist (e.g., maximum likelihood) and Bayesian approaches from the
standard literature may be employed. In test cases (like that of Section
5.1) we have obtained similar results from both standard maximum likelihood
(ML) estimates and Bayesian posterior mode estimates with weakly informative
Student's t priors (in Section 5.1, a prior centered at 0 with a scale
parameter of 2.5 and one degree of freedom, i.e., a Cauchy distribution, is
employed). In conventional logistic regression settings, Gelman et al.
(2008) recommend a t prior distribution as the default choice for routine
use. They argue that it has the advantage of always yielding a well-defined
posterior estimate, and of automatically applying more shrinkage to
higher-order interactions. Gelman et al. (2008) derive a modified EM
algorithm to produce the parameter and error estimates. The analyst may
interpret the resulting estimator in either frequentist or Bayesian terms.
From a Bayesian point of view, the estimator in this case is the mode of the
posterior distribution where all model parameters are viewed a priori as
having a multivariate t distribution, an estimator that is optimal under 0/1
loss. Within a frequentist framework, the use of a "prior" structure may be
thought of as a bias reduction technique. As past work on related models has
suggested that estimates of uncertainty are better-behaved under this
alternate procedure than estimates obtained from the Hessian matrix of the
deviance, we recommend the use of the former in typical settings.
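
To make the "always yielding a well-defined posterior estimate" point
concrete, the sketch below compares ML and posterior mode estimation on a
toy data set with complete separation. It uses plain gradient ascent with
an independent Cauchy(0, 2.5) log-prior added to the log-likelihood; it is
not the modified EM algorithm of Gelman et al. (2008), it omits their
input rescaling, and the function name and toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logistic_mode(X, y, scale=2.5, steps=2000, lr=0.1, use_prior=True):
    """Gradient ascent on the log-likelihood, optionally plus a Cauchy log-prior.

    The log-density of a Cauchy(0, scale) prior is, up to a constant,
    -log(1 + (theta / scale)^2), with derivative
    -2 * theta / (scale^2 + theta^2).
    """
    theta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(sum(t * v for t, v in zip(theta, xi)))
            for j, v in enumerate(xi):
                grad[j] += resid * v
        if use_prior:
            for j, t in enumerate(theta):
                grad[j] -= 2.0 * t / (scale ** 2 + t ** 2)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


# Toy separated data: the event occurs in every observation, so the
# likelihood has no finite maximizer for this single coefficient.
X = [[1.0] for _ in range(5)]
y = [1] * 5

theta_ml = logistic_mode(X, y, use_prior=False)[0]   # keeps growing with steps
theta_map = logistic_mode(X, y, use_prior=True)[0]   # settles at a finite mode
```

Running the ML fit for more steps pushes the estimate ever higher, while
the penalized fit stabilizes; this is the sense in which the prior
regularizes the estimator.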

4.3. Model Adequacy Assessment and Simulation Analysis

Model selection and assessment are common problems in all fields employing
mathematical and statistical models. In this paper we begin by
distinguishing between model selection and model assessment. For the former
problem, we recommend that the analyst start with standard model selection
techniques based on penalized log-likelihood approaches, such as the
Bayesian information criterion (BIC) (Schwarz, 1978) or the Akaike
information criterion (AIC) (Akaike, 1974), to decide which model performs
best within a collection of proposed models. This procedure follows standard
statistical practice, and is reasonably well developed; for further details
see Brockwell and Davis (2002) and Gelman et al. (2003). Given that one has
identified the best overall candidate model, we then recommend performing
simulation-based assessments of model adequacy to verify that the candidate
captures the relevant properties of the original data; the approach to
adequacy testing suggested here is an adaptation and extension of those
applied in the computational Bayesian literature (Gelman et al., 2003) and
of the model assessment methods for cross-sectional network data (Hunter et
al., 2008).
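
For reference, the two penalized criteria can be computed directly from a
fitted model's maximized log-likelihood. The sketch below uses the standard
definitions; the candidate labels, log-likelihood values, parameter counts,
and number of observations are hypothetical and are not taken from the
analysis in this paper.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


# Hypothetical candidates: (label, maximized log-likelihood, parameter count).
candidates = [("A", -520.0, 3), ("B", -505.0, 5), ("C", -503.5, 9)]
n_obs = 4000  # hypothetical number of Bernoulli observations

# Lower is better for both criteria.
best_by_bic = min(candidates, key=lambda m: bic(m[1], m[2], n_obs))
best_by_aic = min(candidates, key=lambda m: aic(m[1], m[2]))
```

Because the BIC penalty grows with the log of the number of observations,
it tends to favor smaller models than AIC when the data set is large.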



Modern network analysis often applies simulation-based methods for anal- 
ysis, prediction, exploration or model diagnostics. Simulation is typically 
used in these cases because few network models lend themselves to analytical 
treatment. In this paper we employ simulation methods in order to ascertain 
the model performance on a series of theoretically motivated network metrics 
(i.e., model adequacy assessment). 

In the machine learning literature there are a number of different
approaches to prediction, one of which is known as the 50-percent classifier
rule (Devroye et al., 1996). The 50-percent classifier rule is a threshold
model (with threshold 0.50) in which it is assumed that an event occurs if
its predicted probability exceeds one half. This predictive model may be
applied to an inhomogeneous Bernoulli structure in a straightforward manner:
apply this threshold to each predicted probability, in this case first to
the vertex set and then to the resulting edge set predictions. It is quite
natural to generalize this basic approach through simulation (i.e., apply a
Bernoulli process to each predicted probability and use a computer to
generate n predictions (0 or 1) from each given probability). We refer to
this technique as an inhomogeneous Bernoulli classifier. This method allows
for a full assessment of the predictive uncertainty of the inferred model
under the assumed conditions.
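
The contrast between the two rules can be sketched in a few lines. The
function names below are hypothetical, and the probabilities would in
practice come from the fitted vertex or edge model.

```python
import random


def threshold_classifier(probs, cutoff=0.5):
    """50-percent rule: predict 1 when the probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]


def bernoulli_classifier(probs, n_draws, rng):
    """Inhomogeneous Bernoulli classifier: n_draws independent 0/1 vectors,
    each entry drawn with its own predicted probability."""
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(n_draws)]


probs = [0.0, 0.2, 0.5, 0.9, 1.0]
point_prediction = threshold_classifier(probs)
simulated = bernoulli_classifier(probs, n_draws=100, rng=random.Random(42))
```

The threshold rule returns a single prediction, whereas the simulated draws
give a predictive distribution for any function of the outcome vector.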

The algorithm we employ is as follows (Algorithm 1): for each time point t
we predict n observations one step ahead (i.e., we predict time point t from
time t - 1) by applying the aforementioned inhomogeneous Bernoulli
classifier, where first we predict the vertex set (i.e., the vertices that
we project to occur at time t), and then from the vertex set we predict the
edge structure. We then save a set of well-chosen Graph Level Indices (GLIs)
(Wasserman and Faust, 1994; Anderson et al., 1999a), so that we are not
required to store n graphs at each time point, which could become
computationally impractical in many of the desired cases for this model.

[Algorithm 1]
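
Algorithm 1 is not reproduced in this text, so the following is only a
sketch reconstructed from the prose description above. The callables
vertex_prob and edge_prob are hypothetical stand-ins for the fitted vertex
and edge models, and network size and density are used as example GLIs.

```python
import itertools
import random


def density(vertices, edges):
    """Share of possible undirected ties among `vertices` that are present."""
    n = len(vertices)
    return 0.0 if n < 2 else len(edges) / (n * (n - 1) / 2.0)


def one_step_gli_draws(observed, risk_set, vertex_prob, edge_prob,
                       n_draws, rng):
    """Simulate n_draws one-step-ahead networks per time point, keeping GLIs.

    observed: list of (vertex_set, edge_set) pairs, one per time point,
        where each edge is a frozenset {i, j}.
    vertex_prob(v, prev_vertices, prev_edges): probability that v appears.
    edge_prob(i, j, vertices, prev_vertices, prev_edges): probability of a
        tie between i and j, given the simulated vertex set.
    Returns {t: [(size, density), ...]} for t = 1, ..., len(observed) - 1.
    """
    draws_by_time = {}
    for t in range(1, len(observed)):
        prev_vertices, prev_edges = observed[t - 1]
        draws = []
        for _ in range(n_draws):
            # Step 1: Bernoulli draw for every member of the risk set.
            vertices = {v for v in sorted(risk_set)
                        if rng.random() < vertex_prob(v, prev_vertices,
                                                      prev_edges)}
            # Step 2: Bernoulli draw for every dyad among those vertices.
            edges = {frozenset((i, j))
                     for i, j in itertools.combinations(sorted(vertices), 2)
                     if rng.random() < edge_prob(i, j, vertices,
                                                 prev_vertices, prev_edges)}
            # Step 3: store graph-level indices, not the graph itself.
            draws.append((len(vertices), density(vertices, edges)))
        draws_by_time[t] = draws
    return draws_by_time
```

Each prediction conditions on the observed network at the previous time
point, so simulation error does not accumulate across time steps.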

The reason for concentrating on GLI distributions is twofold: first, it 
is often difficult or impractical to visually inspect thousands of simulated 
networks, nor are these easy to compare statistically in simple and practical 
terms without the use of descriptive indices. Second, it is typically the case 
that the analyst is not concerned with the occurrence of a single edge or 
vertex, but rather with the overall macro-level properties of the network (e.g. 
mean degree, triad census, centrality measures, connectedness measures, and 
so forth). Examination of a limited set of index distributions accomplishes 
the latter goal, while avoiding the former difficulty. 

After we perform the simulation procedure, we say that the proposed model
"adequately" captures a given feature of the observed network at a specified
level of precision α if the associated GLI value falls within the central
α-coverage simulation interval for the model in question. The optimal case
is naturally one in which the simulated GLI distribution is centered on the
observed value, with little variation; for a simple model of a complex
system, however, we may employ a looser criterion (e.g., coverage by the 95%
simulation interval for a certain fraction of time steps). Selection of both
the GLIs to study and the adequacy criteria is necessarily dependent upon
substantive considerations (including the use to which the model is to be
put). For example, if one's central theoretical concern is the explanation
of transitivity in an evolving network, then ensuring that this index is
well accounted for by the model (in the sense of being reliably included in
simulation intervals with α < 0.95) would be critical. In the same context,
one might be less concerned with capturing, say, mean degree, but may
nevertheless show concern if such a basic property were not covered by wide
(say, 99%) simulation intervals in a significant fraction of time points.
For an extensive example of this procedure see Section 5.3.
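
The coverage check described above can be sketched as follows. The interval
is taken from empirical order statistics of the simulated GLI values, which
is one of several reasonable quantile conventions and is an assumption
here; the function names are likewise hypothetical.

```python
import math


def central_interval(values, coverage=0.95):
    """Empirical central interval of simulated GLI values (order statistics)."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (1.0 - coverage) / 2.0
    lower = ordered[int(math.floor(tail * (n - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lower, upper


def coverage_fraction(observed_by_time, simulated_by_time, coverage=0.95):
    """Fraction of time points whose observed GLI lies inside the interval."""
    hits = 0
    for t, observed in observed_by_time.items():
        lower, upper = central_interval(simulated_by_time[t], coverage)
        if lower <= observed <= upper:
            hits += 1
    return hits / len(observed_by_time)
```

For instance, an observed GLI series that falls inside the interval at 27
of 28 predicted time points would yield a coverage fraction of 27/28, which
the analyst then judges against the substantive criterion chosen in
advance.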



5. Sample Application: Going to the Beach 

To illustrate the application of the dynamic network logistic regression
approach, we apply the methods discussed in this paper to the analysis of a
classic network data set. This data set involves a dynamically evolving
network of interpersonal communication among individuals congregating on a
beach in Southern California, observed over a one-month period (Freeman et
al., 1988; Freeman, 1992). Interpersonal communication in small groups is a
well-studied subfield in social psychology and social network analysis
(Festinger and Thibaut, 1951). The dynamic study of interpersonal
communication networks was pioneered by Nordlie (1958) and Newcomb (1961);
here, we show how the dynamic logistic family allows us to flexibly model
the evolving network, with particular emphasis on the interplay between tie
structure and vertex set dynamics.

5.1. Data 

The data analyzed in the following sections was originally collected and
analyzed in aggregate by Freeman et al. (1988) and has since been used in a
number of influential articles (see Cornwell, 2009; Hummon and Doreian,
2003; Zeggelink et al., 1996; etc.). While this network is typically
analyzed in aggregate, it was originally collected as a dynamically evolving
network (where the vertex set is composed of windsurfers and the edge set is
composed of interpersonal communication). The network was collected daily
(sampled at two time points each day) for 31 days (August 28, 1986 to
September 27, 1986).^

Individuals were tracked with a unique ID, and were divided by Freeman et
al. into those we will here call "regulars" (N = 54) - frequent attendees
who were well-integrated into the social life of the beach community - and
"irregulars" (N = 41) on ethnographic grounds. The former category was
further broken down by the researchers into two groups, Group 1 (N = 22)
and Group 2 (N = 21), with 11 individuals not classified as belonging to
either Group 1 or Group 2. Altogether, the union of vertex sets ($V_{\max}$)
consists of 95 individuals. On any given day during the observation period,
the number of windsurfers appearing on the beach ranged from 3 to 37, with
the number of communication ties per day ranging from 0 to 96.

These basic characteristics will be used in the illustrative analysis that 
follows, which centers on the question of what drives the evolution of inter- 
personal communication in this open, uncontrolled setting. 

5.2. Mechanisms of Dynamic Interpersonal Communication 

A number of distinctive mechanisms may influence whether a windsurfer
engages another windsurfer at any given time; however, two windsurfers
clearly cannot interact unless both appear simultaneously on the beach, and
thus the first influences to be considered are those affecting the vertex
set. For this illustrative analysis, we propose four basic mechanisms for
the propensity of an individual to appear on a given day: (1) a regularity
effect; (2) an inertial network effect (i.e., the lag term); (3) a
three-cycle effect (because this graph is symmetric this is equivalent to a
triadic term); and (4) seasonal effects (e.g., day of week). An intuitive
summary of each mechanism follows.

Of the four mechanisms we consider as drivers of vertex set dynamics, 
the first is regularity, the notion that an individual is more likely to appear 
on any given day if he or she is one of the individuals who is classified (on 
ethnographic grounds) as belonging to the category of "regulars" who form 
the core of the beach community. This recognizes the fact (known from the 
observational accounts) that there is heterogeneity among the windsurfers, 
with certain individuals being much more active than others. 



^Unfortunately, one day (September 21st) is missing due to a race on a
different beach, which precluded data collection. Thus, complete data is
available for 30 days during the observation period.

The second posited mechanism is one of persistence or inertia - i.e., if an
individual is active today, he or she is more likely to be active tomorrow.
This is sometimes known in the social network literature as "behavioral
inertia" and has been seen both empirically and experimentally in varied
social network contexts (Corten and Buskens, 2010).

The third mechanism is a triangle effect, where the number of three-cycles 
in which an individual is embedded at point t — k influences the likelihood 
of whether an individual will appear on day t. This may be thought of as 
capturing the effect of social participation, with the intuition that persons 
embedded in dense social groups (e.g., cliques) are more likely to have their 
attendance reinforced, and thus to return to the beach. 

The fourth mechanism is seasonality, i.e. the tendency for activity to 
show systematic variation over daily or weekly cycles. Cyclic phenomena are 
common in human systems, as has long been recognized in the time series lit- 
erature (Shumway and Stoffer, 2006). Common seasonal effects in behavioral 
data include daily and hourly effects (e.g., differences between weekday and 
weekends, or midnight versus midday). Networks are no exception to this 
rule, as evidenced by Baker's (1984) observation of daily variation in struc- 
ture and activity within trading networks in a national securities market, and 
Butts and Cross's (2009) finding that the volatility of evolving blog citation 
networks changes with time of day, day of week, and external events (in that 
particular case, phases of the 2004 US electoral cycle). In the present case, 
a parallel phenomenon may occur through weekly cycles in the frequency 
of attendance at the beach (a reasonable expectation given the institutional 
context of work and leisure time for the study population during this period). 

Once the vertex set arises, the influence of a new set of interpersonal 
communication mechanisms becomes relevant. Of the many potential mech- 
anisms that could govern interpersonal communication in the study popula- 
tion, we here explore six: (1) regularity of beach use and other assortative 
mixing effects; (2) individual propensity effects for regularly occurring indi- 
viduals; (3) contagious participation; (4) inertial network effects (e.g., the 
lag term); (5) embeddedness; and (6) seasonal effects. As with the vertex 
model, we briefly consider each of these in turn. 

The first mechanism is assortative mixing between those identified as
regular beach goers and those who were classified as irregulars. In the
social network literature, effects of a priori group partitioning on tie
formation are often referred to as mixing effects. McPherson et al. (2001)
review extensive evidence that individuals cluster on homophilous grounds,
and thus we might expect that those more deeply embedded in the milieu of
the beach environment (the "regulars") will be more likely to talk with
others of the same ilk (and, likewise, that outsiders will be more likely to
interact with other outsiders). Furthermore, among the regulars, those
identified as belonging to the same core groups by the ethnographic
observers are conjectured to mix at higher rates, ceteris paribus, than
others.

The second mechanism consists of individual-level heterogeneity in the 
propensity of regular attendees to engage in communication with others. 
We might expect that idiosyncratic shyness or gregariousness of regularly 
occurring individuals may influence the amount of activity on a given day. 
Similar to the argument applied for the first mechanism we might expect the 
basic propensity of a regular attendee to engage or not engage other beach 
members to be highly influential on the amount of activity on any given day. 

The third mechanism is contagious participation, based on the notion that 
high levels of beach-going activity at the group level are likely to translate into 
high levels of other activity (including communication). Thus, we take the 
number of persons present itself as a predictor of the propensity of individuals 
to communicate with others on the beach. 

The fourth mechanism is persistence or inertia - i.e., if an individual is 
active or has a relation today, he or she is more likely to be active or have a 
relationship tomorrow. 



The fifth mechanism is embeddedness (see Granovetter, 1985). A dyadic
relationship which is embedded within a broader communicative context -
e.g., in which the persons in question are linked by numerous past chains
of communication - is likely stronger and more likely to persist at a later
time point than one lacking such a context. We measure embeddedness by the
number of k-cycles within which a given relation is embedded.

The sixth mechanism is seasonality, here in the propensity to form ties 
rather than the tendency to appear at the beach. This might arise for a 
number of reasons, e.g., systematic variation in the sorts of people who go on 
weekdays versus weekends, differences in activities pursued during weekday 
versus weekend excursions, and so forth. 

Each of the proposed mechanisms for both vertex formation and edge creation
may or may not be important to the network structure, which brings up the
necessary process of model selection and model adequacy assessment. In the
following sections we first employ a penalized deviance criterion to select
the best fitting model. We then employ a series of simulation-based model
adequacy checks, as discussed in Section 4.3, to assess the extent to which
the selected model does or does not capture important features of the
evolving network.



5.3. Model Selection and Adequacy 
5.3.1. Parameterization 

To implement our model, the impact of each of the mechanisms in Section 5.2
is operationalized as a weight or parameter in the dynamic logistic
regression framework. The first step in the model-building process is to select
the vertex mechanisms, which are highly influential in this context because 
the vertex portion of the model predicts "who shows up to the party" (so 
to speak), and thus who is eligible to interact at a given time point. The 
importance of "who shows up" will greatly depend on the context and actor- 
specific covariates in a given dynamic network. For the beach data (as we 
will see) the most important attribute that an individual carries with him or 
her is whether or not he or she is a regular beach attendee (and which group 
within the regular attendees he or she is). If more information about these 
windsurfers had been collected we might, for example, expect there to be a 
gender effect and/or a "couple" effect. It should also be noted, however, that 
individuals carry more with them than their exogenous covariates: insofar as 
an individual's interaction history affects his or her probability of communi- 
cating with others, he or she is less substitutable with peers having different 
histories of interaction. Thus, correct prediction of individual attendance can 
be important even in settings for which exogenous covariates are limited (or 
altogether absent). 

In addition to specifying putative mechanisms, our vertex model requires
specification of the risk set ($V_{\max}$), i.e., the set of persons effectively at risk for
showing up on a given day. Here, we treat all individuals observed at any time 
during the data collection window as our risk set, lacking other information 
on potential attendance. While this is obviously a simplification, we view 
the total set of all persons appearing over an entire month as a reasonable 
proxy for the unobserved collection of persons with a non-small chance of 
appearing on any given day. 

As with other exponential family models, we capture the effects of putative
mechanisms by statistics that (together with their associated parameters)
determine the probability that an edge or vertex will appear at a given
point in time. In describing these statistics, we employ the following
notation. Within this section, $t$, $i$, and $j$ jointly index the adjacency
structure, so that, e.g., $Y_{tij}$ represents the edge between the $i$th
and $j$th vertices of $V_{\max}$ at time $t$. Time itself is indexed in
integer increments $1, \ldots, T$ (e.g., $T = 31$ for the beach network). We
will frequently use $k$ to represent lags, e.g., with $Y_{t-k}$ representing
the state of the edge set at time $t - k$. The vertex and edge model
statistics themselves follow the basic form of Section 3.2, with
$w_{tp}(V, Y, X)$ being a generic function for a statistic at vertex $p$,
and $u_{tij}(V, Y, X)$ being a generic function for a statistic at edge
$ij$. $X$ represents the relevant covariates for a vertex or edge: $X_p$ is
a dichotomous variable for whether $v_p$ is a regular or an irregular;
$X_{ij}$ is a categorical variable for whether edge $ij$ is regular to
regular, irregular to irregular, or regular to irregular (and vice versa);
$X_{tp}$ is the day (Monday, ..., Sunday) at time $t$ for vertex $v_p$; and
$X_{tij}$ is the day at time $t$ for edge $ij$. For simplicity in notation,
we also define two measures: (1) $T_{tp}$ = the count of triangles within
which $v_p$ is embedded at time $t$; and (2) $Q_{tij}$ = the count of
$\ell$-length cycles within which edge $ij$ is embedded.

To implement our covariate effects, we employ a series of dummy variables
for whether an individual is in the regular category or in the Group 1
category ($w_{tp}(V, Y, X) = X_p$).^ For the inertial mechanism, we employ a
single lag term, with the basic interpretation that if this weight is
positive then an individual is more likely to appear on a given day if he or
she was at the beach the day before
($w_{tp}(V, Y, X) = I\{v_p \in V_{t-k}\}$, i.e., one if the focal actor was
present at time $t - k$ and zero otherwise). For the triangle effect we
employ a three-cycle lag statistic, with the interpretation that a vertex is
more likely to appear on a given day if he or she was embedded in a triangle
relation the day before ($w_{tp}(V, Y, X) = T_{t-k,p}$, i.e., the number of
3-cliques in which the focal actor participated at time $t - k$). We employ
a dummy variable for each day of the week, thus allowing for a higher or
lower likelihood of every individual appearing on a given day of the week
($w_{tp}(V, Y, X) = (I\{X_{tp} = \text{Tuesday}\}, \ldots,
I\{X_{tp} = \text{Sunday}\})$, with Monday as the reference category).
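
The vertex statistics described above can be assembled into a single
feature vector per vertex and time point. The sketch below is illustrative
only: the function names and the ordering of the statistics are
assumptions, edges are represented as frozensets {i, j}, and any intercept
term is omitted.

```python
import itertools

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def triangle_count(p, edges):
    """Number of triangles containing vertex p; edges are frozensets {i, j}."""
    neighbors = {v for e in edges if p in e for v in e if v != p}
    return sum(1 for a, b in itertools.combinations(sorted(neighbors), 2)
               if frozenset((a, b)) in edges)


def vertex_statistics(p, is_regular, in_group1, prev_vertices, prev_edges, day):
    """Statistics for vertex p at time t, using the network at the lagged time."""
    covariates = [1.0 if is_regular else 0.0, 1.0 if in_group1 else 0.0]
    lag = [1.0 if p in prev_vertices else 0.0]
    triangles = [float(triangle_count(p, prev_edges))]
    # Monday is the reference category, so it gets no dummy variable.
    day_dummies = [1.0 if day == d else 0.0 for d in DAYS[1:]]
    return covariates + lag + triangles + day_dummies
```

Stacking these vectors over all members of the risk set and all time points
gives the design matrix for the vertex portion of the logistic regression.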

As with the vertex model, operationalization of the edge model is performed
by mapping the putative edge formation mechanisms onto a set of sufficient
statistics. Per the previous discussion, the mechanism of assortative mixing
between regulars and irregulars is implemented as three indicator statistics
($u_{tij}(V, Y, X) = I\{X_{ij} = c\}$, one for each category $c$ of
$X_{ij}$), where the first represents the baseline effect of regular to
regular interaction, the second represents irregular to irregular
interaction, and the last represents regular to irregular (and vice versa)
interaction (noting that this term stands in place of the standard intercept
term). The mechanism of individual-level heterogeneity is implemented as a
dummy variable for each of the Group 1 members.^ The mechanism of contagious
participation is implemented as a density effect
($u_{tij}(V, Y, X) = \log(|V_t|)$), which changes dynamically based on the
log of the number of individuals at the beach on the given day of interest
(exploiting the fact that, because each day's edge realization is
conditioned on that day's vertex set, properties of the latter can be used
to predict the former). The mechanism of inertia is implemented as a single
lag term ($u_{tij}(V, Y, X) = Y_{t-k,ij}$). The embeddedness effect is
implemented as the log of the dyadic count of the number of cycles (up to 9)
of the lagged network ($u_{tij}(V, Y, X) = \log(Q_{t-k,ij} + 1)$), with the
interpretation that a dyadic interaction is more or less likely if the edge
existed yesterday and was in more or fewer cycles (depending on the sign of
the weight). The last mechanism, seasonality, is again implemented as a
series of dummy variables for each day of the week, with Monday as the
reference category ($u_{tij}(V, Y, X) = (I\{X_{tij} = \text{Tuesday}\},
\ldots, I\{X_{tij} = \text{Sunday}\})$).

^We also tested Group 2, but this group was not particularly influential in
either the vertex prediction or interaction of individuals.
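
A subset of the edge statistics can be sketched in the same style. Several
choices here are assumptions rather than the authors' implementation: the
"up to 9" is read as a maximum cycle length, the cycle count is set to zero
when the dyad had no tie in the lagged network, and the Group 1 and
day-of-week dummies are left out for brevity. Function names are
hypothetical and edges are frozensets {i, j} with no self-loops.

```python
import math


def cycles_through_edge(i, j, edges, max_len=9):
    """Count simple cycles of length 3..max_len that contain the edge {i, j}.

    Returns 0 when {i, j} is absent from `edges`. Each such cycle
    corresponds to a simple path from i to j with 2..max_len-1 edges that
    avoids the direct tie.
    """
    if frozenset((i, j)) not in edges:
        return 0
    adjacency = {}
    for e in edges:
        a, b = tuple(e)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    def walk(node, visited, length):
        total = 0
        for nxt in adjacency.get(node, ()):
            if nxt == j:
                if length + 1 >= 2:  # skip the direct tie itself
                    total += 1
            elif nxt not in visited and length + 1 < max_len - 1:
                total += walk(nxt, visited | {nxt}, length + 1)
        return total

    return walk(i, {i}, 0)


def edge_statistics(i, j, regular, vertices_t, prev_edges):
    """Mixing, participation, lag, and embeddedness statistics for dyad ij."""
    both = regular[i] and regular[j]
    neither = not regular[i] and not regular[j]
    mixing = [1.0 if both else 0.0,
              1.0 if neither else 0.0,
              1.0 if not (both or neither) else 0.0]
    participation = [math.log(len(vertices_t))]
    lag = [1.0 if frozenset((i, j)) in prev_edges else 0.0]
    embeddedness = [math.log(cycles_through_edge(i, j, prev_edges) + 1)]
    return mixing + participation + lag + embeddedness
```

Because the three mixing indicators are exhaustive, no separate intercept
is included, matching the remark above that they stand in place of the
standard intercept term.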

5.3.2. Model Fit 

Each mechanism proposed in Section 5.2 may or may not influence whether a
windsurfer arrives on a given day and/or whether or not he or she interacts
with another windsurfer on that given day; thus it is worth applying some
generally accepted model selection procedure in order to choose the model
with the best combination of influences. We interpret any mechanism not
selected through this procedure as one that is not influential in this
process (i.e., we reject the hypothesis that the mechanism is a substantial
factor in shaping the evolution of this network, net of other mechanisms).
In this particular case we perform model selection using the BIC score,
selecting the model in which the BIC is lowest (it may be seen that the full
model is the best fitting model under this criterion, see Table 2 or Table
3, and therefore each mechanism proposed is tested directly).

Overall, we find that the best-fitting model for the vertex process is one
that incorporates differential base rates of attendance for "regulars" and
(above and beyond this) for members of Group 1, as well as simple inertia,
prior participation in cohesive conversation subgroups, and weekly
seasonality. Thus, the data support the contention that all of the
conjectured mechanisms for the attendance process are active in this case.
For the edge process, we likewise find that all conjectured mechanisms -
mixing, individual heterogeneity, contagious participation, inertia, prior
embeddedness, and seasonality - are active in governing who communicates
with whom (conditional on who shows up). Interpretation of model parameters
is discussed below.

^We also tested all regulars and Group 1 and Group 2 individuals, and just
Group 2 individuals, but found that Group 1 individuals were the set of most
influential actors in this case.

5.3.3. Model Adequacy

To evaluate the model adequacy of the best fitting model, we employ
simulation-based one-step prediction under an inhomogeneous Bernoulli
classifier as discussed in Section 4.3. While the selected model may be the
best fitting of those available, we are also interested in assessing the
extent to which it can effectively capture the properties of the evolving
beach network per se; significant failures in this regard may suggest the
need for further elaboration. In the present case, we begin with simple
network features such as size and density (and, therefore, mean degree). In
the context of interpersonal communication on a beach, capturing local group
structure is also of interest; thus we include the statistics of the
undirected triad census (null, dyad, two path, and triangle) as targets for
evaluation.^ To evaluate our ability to capture inequality in communication,
we include degree centralization (Freeman, 1979). And, lastly, we may be
interested in our ability to predict the extent to which the communication
network formed on a given day will be well-connected, a feature that we
examine using the Krackhardt connectedness statistic (Krackhardt, 1994). The
simulation intervals for each GLI under the best fitting model (Model 4;
Figure 2) perform reasonably well under the criterion suggested in Section
4.3 (α < 0.95): in Table 1 we see that the observed GLI falls within the
0.95 interval in over 26 of the 28 predicted time points for all but Mean
Degree (and in fact falls within the interval all 28 times for 5 of 9 GLIs).

[Table 1]

^It is known that the triad census governs a number of key network
statistics, such as transitivity; see also Faust (2010).

[Figure 2]



Lastly, we perform a 5-step prediction of the full model (Figure 3) as a
form of visual analysis to verify that the model is not producing degenerate
structures; these could include, for example, those identified by Robins et
al. (2005), such as giant "clumps," so-called "caveman" graphs, or other
highly clustered graphs. Such structures are largely considered pathological
and unrepresentative of "real-world" social networks, and (more importantly)
do not resemble the types of networks arising within our observed data.
Inspection of the graphs generated through the 5-step prediction verifies
that the networks predicted by the model are non-pathological, neither
converging to an unrepresentative canonical structure (as in the Robins et
al. case) nor producing effectively random graphs with less structure than
the observed data. Taken together with the GLI-based adequacy checks, these
results suggest that the model is indeed doing a reasonable job of capturing
the core features of the evolving network.

[Figure 3]

5.4. Parameter Interpretation

The parameter estimates presented in Tables 2 and 3 are interpreted in
terms of the influence of the mechanisms proposed in Section 5.2. To simplify
presentation, we discuss these mechanisms in two parts, starting with vertex
mechanisms and proceeding to mechanisms associated with the edge set.

5.5. Vertex Mechanisms 

We proposed three basic mechanisms for the vertex set dynamics in this
particular context (Table 2, Model 4). The first was whether or not an
individual's group status was predictive of attendance. As expected, we find
that being a "regular" has a significant and positive influence on whether an
individual is likely to appear on any given day (versus being an "irregular"),
with those in Group 1 being even more likely to appear. The second mechanism
was that being present at the beach on the prior day would make one more
likely to appear at the beach on the next day, which is indeed what we find
(the weight is again positive and significant). Similarly, if one is engaged
in a conversational clique the day before, then one is even more likely to
appear the next day than if one is simply present; indeed, each 3-clique
in which an individual participates increases his or her conditional odds of
subsequent attendance by over 40%. Finally, we see that beach attendance
is indeed highly seasonal: with the exception of a slight bump on Thursday,
weekends are substantially more popular times for beach-going than the work
week (Tuesdays, in particular). These seasonal effects are comparable in
magnitude to the effect of being a regular, and exceed the effect of inertia
(although inertia combined with participation in 1-2 conversation clusters
has a similar overall effect).
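
To make the odds interpretation concrete, the following sketch converts a logit coefficient into a multiplicative change in odds. It assumes the 3-clique term enters the vertex model as an ordinary logistic regression coefficient; the fitted value itself is reported in Table 2 and is not reproduced here, so the example uses log(1.4), the smallest coefficient consistent with a 40% increase. The name `odds_multiplier` is hypothetical.

```python
import math

def odds_multiplier(theta, count=1):
    """Multiplicative change in odds when a covariate with logit
    coefficient theta increases by `count` units."""
    return math.exp(theta * count)

# Smallest coefficient consistent with a 40% increase in odds per 3-clique
# (an illustrative bound, not the fitted estimate).
theta_min = math.log(1.4)

one_clique = odds_multiplier(theta_min, 1)   # odds scaled by roughly 1.4
two_cliques = odds_multiplier(theta_min, 2)  # effects compound: roughly 1.96
```

Because the effect is multiplicative on the odds scale, participation in two 3-cliques would nearly double the conditional odds of attendance under this bound.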

[Table 2]

5.5.1. Edge Mechanisms 

We proposed five basic mechanisms shaping whether or not a beach goer
was likely to engage in interpersonal communication (Table 3, Model 4),
starting with assortative mixing of regulars (and group members). The mixing
hypothesis is confirmed in the sense that regulars are more likely to interact
with other regulars, but refuted in the sense that irregulars are more likely
to interact with regulars than with other irregulars. This suggests a
core-periphery phenomenon, wherein irregulars are more likely to interact with
"core" regulars who go to the beach more often and are more likely to be
knowledgeable about the sport and area. Mechanism four, individual differences
within the most influential group (Group 1), is confirmed: all but one
individual is significantly more or less likely to interact than baseline.
These effects can be substantial (as high as +2.5 or as low as -1.14).
Mechanism three, contagious participation, is highly influential and is both
positive and significant. The inertial hypothesis is confirmed, since the lag
and the cycle terms are positive and significant (it should be pointed out
that the number of cycles, cumulative up to 9, that a dyad may be involved in
can be quite large (e.g., in the thousands), and thus this term can be quite
influential).

For mechanism five, it is important to point out that many of these terms
cannot be interpreted independently. For example, everyone, regardless of
their categorization of "regularity," is influenced by the number of
individuals on the beach on a given day. To put this in perspective, take the
highest number of individuals to appear on the beach over the 31 days (37
individuals), so that $\log(37) \cdot 2.72 = 9.82$, and compare it to the
lowest, $\log(3) \cdot 2.72 = 2.99$. To fully grasp how this interacts with
the day of the week, it is important to note that network size is highly
correlated with the day of the week; we find that there are more individuals
on the beach on a typical Saturday or Sunday than on a typical weekday (e.g.,
the lowest day occurs on a Wednesday and the highest day occurs on a Sunday),
such that the total effect on baseline density at the high end is
$\log(37) \cdot 2.72 - 1.69 = 8.13$ versus a total lowest-day effect of
$\log(3) \cdot 2.72 + 1.14 = 4.13$. Thus the baseline propensity for
interaction is given almost twice the boost (on the logit scale) on the day
with the largest number of beach goers versus the day with the smallest
number of beach goers. We therefore observe that, as the beach becomes more
populated, the chance of interacting with any given individual increases, and
therefore we find evidence for our hypothesis of contagious participation.
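
The arithmetic behind this comparison can be checked directly. The sketch below uses only the quantities quoted in the paragraph above (a weight of 2.72 on log network size, and day-specific terms of -1.69 and +1.14); the helper name `baseline_shift` and the constant names are hypothetical.

```python
import math

SIZE_COEF = 2.72         # weight on log(network size), as quoted in the text
HIGH_DAY_EFFECT = -1.69  # day-specific term on the busiest day (37 present)
LOW_DAY_EFFECT = 1.14    # day-specific term on the quietest day (3 present)

def baseline_shift(n_present, day_effect=0.0, size_coef=SIZE_COEF):
    """Logit-scale shift in baseline density from network size plus day."""
    return math.log(n_present) * size_coef + day_effect

high = baseline_shift(37, HIGH_DAY_EFFECT)
low = baseline_shift(3, LOW_DAY_EFFECT)

print(round(baseline_shift(37), 2), round(baseline_shift(3), 2))  # 9.82 2.99
print(round(high, 2), round(low, 2))                              # 8.13 4.13
print(round(high / low, 2))                                       # 1.97
```

The final ratio of about 1.97 is the "almost twice the boost" on the logit scale.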

[Table 3]

Once the set of beach goers is chosen, the important factors that predict
whether any two or more individuals will interact stem from each individual's
ethnographically defined group (i.e., the "regulars"; this is especially true
for members of Group 1), with the specific effect that all individuals,
regardless of status, are likely to interact with the regular attendees.
Individual differences in baseline propensity to interact are important, but
only for Group 1 members, where the effect can be quite large. Thus,
predicting which of the Group 1 members will appear on a given day is
identified as an important factor in model success. The base activity in the
network is greatly influenced by the number of individuals who appear on a
given day, with a higher probability of interaction for every individual on
the beach when more people are present. Finally, an individual engaged in
activity in the immediate past is more likely to engage in activity again in
the present, and this effect is magnified if the individual is embedded in
larger conversational structures.

6. Discussion and Conclusion 

The dynamic network logistic regression framework proposed in this ar- 
ticle builds on a number of well-established concepts in the social network 
literature. We have extended this prior work by incorporating vital dynam- 
ics, clarifying the assumptions needed to model joint vertex/edge dynamics 
in logistic form, and addressing practical issues such as model assessment 
and scalability. Applying the resulting framework to a dynamic network, 
we illustrated how this approach allows us to identify mechanisms underly- 
ing both individual presence/absence and relationships in a straightforward 
fashion.

Based on our model adequacy checks, we find that our proposed model
does a reasonable job of capturing many properties of the beach data, despite
the lack of available covariates (e.g., age, race, prior relationships) that would
undoubtedly facilitate prediction. Notwithstanding our model's limitations,
we find that the mechanisms most important to prediction of dynamic network
collaboration in the Southern California beach data are assortative
mixing, inertia (in the dyadic sense and in the number of cycles one is engaged
in), individual differences of key players, the size of the network itself, and
seasonality. As expected, we find that those identified ethnographically as
core members of the beach community are more likely to be present on a
given day, along with factors such as having been active on a previous day,
and having been previously involved in group interaction. We also find that
the day of the week greatly influences the number of individuals who appear
on any given day.

We have noted repeatedly throughout the paper that a good vertex set
model is key to effective prediction of joint vertex/edge set evolution, a fact
that can be dramatically illustrated by comparing the model of Section 5
to a similar model that combines the best edge model with a vertex set fixed
to $V_{\max}$ (i.e., assuming all actors are eligible to interact). The
results are shown in Figure 4. Notice that the model simulation intervals
never cover the observed statistics, and are often so far from the observed
values that they do not fall within the range of the observed statistics over
the entire observation period (Figure 2). Naive approaches to solving the
vertex problem clearly will not work in this context.

[Figure 4]

Comparing performance of our best-fit model to a naive model without a 
well-specified vertex component underscores the critical interaction between 
the size and composition of the vertex set and the structure of the result- 
ing relationships. In particular, we find that models that do not accurately 
capture vertex set dynamics are deeply pathological for predicting other as- 
pects of structure as well: one simply cannot get the edge set right without 
first modeling the vertex set. Since vertex set models are rarely employed 
at present, this observation calls into question the trustworthiness of the 
current generation of dynamic network models. While more research is cer- 
tainly needed on this point, our experience thus far has strongly suggested 
that predictive adequacy for dynamic network models in realistic settings
will depend at least as heavily on capturing the factors that lead to
individual presence and participation as on modeling the factors that lead
participating individuals to interact. This implies a substantial rethink of
our current ideas regarding network evolution.
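
One simple way to see why errors in the vertex set propagate to the edge set is that the number of dyads grows quadratically with the number of actors present. The sketch below is purely illustrative: it uses the smallest and largest daily attendance quoted earlier (3 and 37), while the constant tie probability of 0.1 and the helper names are hypothetical and are not taken from the fitted model.

```python
def n_dyads(n):
    """Number of unordered pairs among n actors."""
    return n * (n - 1) // 2

def expected_edges(n, tie_prob):
    """Expected edge count if every dyad forms independently with tie_prob."""
    return n_dyads(n) * tie_prob

quiet_day = n_dyads(3)    # 3 dyads
busy_day = n_dyads(37)    # 666 dyads

# With the same hypothetical tie probability, the expected edge count differs
# by a factor of 222 between the two days.
ratio = expected_edges(37, 0.1) / expected_edges(3, 0.1)
```

A model that treats every actor as present on every day therefore misstates the number of opportunities for interaction by orders of magnitude on quiet days, before any edge-level mechanism is considered.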

Although we believe that the logistic framework pursued here is both flex- 
ible and powerful, we wish to end on a note of moderation. There may well be 
settings for which the available historical data does not adequately account 
for dependence among edges (or vertices), and for which the logistic approx- 
imation will perform poorly. Likewise, some research questions may require 
a degree of predictive accuracy that cannot be readily obtained without in- 
corporating simultaneous dependence. For these problems, the framework 
presented here should be viewed as a "first cut" family of models, to be 
extended by the incorporation of additional dependence terms in a manner 
analogous to the extension of Bernoulli graph models in the cross-sectional 
ERGM case. That said, considerable progress may be made by beginning 
one's investigation with models based on conditional independence assump- 
tions, and adding dependence terms only as needed to obtain acceptable 
results. Since the dynamic logistic models can be easily manipulated (and 
understood), they are well-suited to exploratory analysis, and to tasks such 
as the identification of key covariates. They also scale readily to large data 
sets, making them applicable in settings for which models with edgewise 
dependence are too computationally expensive to be employed. These ad- 
vantages make the dynamic logistic family an important and useful tool in 
the analyst's arsenal, as part of the growing family of techniques for modeling 
the dynamics of social structure.

Algorithm 1 Inhomogeneous Bernoulli classifier
for i = 1 to m do
    for t = k to T - 1 do
        \pi^v_{t+1} = Predicted vertex probabilities from model
        p = 1
        for j = 1 to n do
            if Bernoulli(\pi^v_{t+1,j}) = 1 then
                V_{t+1}[p] = v_j
                p = p + 1
            end if
        end for
        \pi^e_{t+1} | V_{t+1} = Predicted edge probabilities from model
        Y_{t+1} = Bernoulligraph(\pi^e_{t+1} | V_{t+1})
        Save[i,t] = GLI(Y_{t+1})
    end for
end for
{T = number of time points, GLI(.) is a function which returns a GLI
or a vector of GLIs, Save is an m by t matrix, and \pi^v_{t+1} and
\pi^e_{t+1} represent the predicted probabilities, where v denotes the
predicted probabilities of the vertex set and e denotes the predicted
probabilities of the edge set.}
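
A runnable sketch of the simulation loop in Algorithm 1 is given below. It is an illustration under stated assumptions, not the authors' implementation: `vertex_prob` and `edge_prob` are hypothetical callables standing in for the fitted vertex and edge models, the graph is treated as undirected, and `density` is used as an example GLI.

```python
import random

def density(n_vertices, edges):
    """Example GLI: density of an undirected graph on n_vertices actors."""
    possible = n_vertices * (n_vertices - 1) / 2
    return len(edges) / possible if possible > 0 else 0.0

def simulate_one_step(candidates, vertex_prob, edge_prob, rng):
    """Draw V_{t+1}, then Y_{t+1} given V_{t+1}, as independent Bernoulli trials."""
    # Vertex step: each candidate actor appears with its predicted probability.
    present = [v for v in candidates if rng.random() < vertex_prob(v)]
    # Edge step: each dyad among the present actors forms a tie independently.
    edges = set()
    for a in range(len(present)):
        for b in range(a + 1, len(present)):
            i, j = present[a], present[b]
            if rng.random() < edge_prob(i, j):
                edges.add((i, j))
    return present, edges

def simulate_gli(candidates, vertex_prob, edge_prob, time_points, m, gli, seed=0):
    """Return m rows of simulated GLI values, one column per time point.

    vertex_prob(v, t) and edge_prob(i, j, t) are assumed to return the
    model-predicted probabilities for the step following time t.
    """
    rng = random.Random(seed)
    save = []
    for _ in range(m):
        row = []
        for t in time_points:
            present, edges = simulate_one_step(
                candidates,
                lambda v: vertex_prob(v, t),
                lambda i, j: edge_prob(i, j, t),
                rng,
            )
            row.append(gli(len(present), edges))
        save.append(row)
    return save
```

For example, `simulate_gli(range(37), vp, ep, range(k, T), m, density)` would produce the table of simulated densities from which simulation intervals can be computed, where `vp` and `ep` wrap the fitted models and `k`, `T`, and `m` are chosen by the analyst.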






Dependence Diagram 




Figure 1: Representation of the dependence graph of the cross-sectional vertex and edge
sets under the assumptions of Section 3.2; t represents time and k represents the number
of lags.

GLI One-Step Prediction Simulation Count (α < 0.95)

GLI                         # Correct
Network Size                26
Density                     28
Mean Degree                 28
Degree Centralization       20
Krackhardt Connectedness    28
Triad Census: 0             28
Triad Census: 1             26
Triad Census: 2             28
Triad Census: 3

Table 1: Check of whether the α < 0.95 simulation interval contains a given GLI. Total
possible correct is 28.